
Take a quick glance at this case study and
analysis.
What kind of problem are we solving? What is the
story?
- We aim to segment the penguins in the Palmer Penguins dataset into
groups that reflect their species by analyzing features such as bill
length, bill depth, flipper length, and body mass. This involves
identifying distinct clusters within the dataset that group similar
penguins together. This study is part of a project for the MGTA 495 -
Marketing Analytics course in the UC San Diego MSBA program. For another
segmentation using K-means, see the Supply Chain Analysis: Late Order
Acknowledgement project.
What is the goal of the analysis?
- The goal is to implement the k-means clustering algorithm by hand,
visualize its steps, and compare the results with the built-in k-means
function in R. Additionally, we calculate and plot the within-cluster
sum of squares (WCSS) and silhouette scores for different numbers of
clusters to determine the optimal number of clusters suggested by these
metrics.
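
To make the goal above concrete, here is a minimal sketch of a hand-written k-means together with the two metrics mentioned. It is not the project's code: the project uses R, while this sketch uses Python's standard library only. All function names (`kmeans`, `wcss`, `silhouette`) are hypothetical, and the six toy points are made up for illustration rather than taken from the Palmer Penguins data.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points]

def update(points, labels, centroids):
    """Move each centroid to the mean of its members (keep it if empty)."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, l in zip(points, labels) if l == j]
        if members:
            new_centroids.append(
                tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            new_centroids.append(old)
    return new_centroids

def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd-style k-means; initial centroids are k randomly chosen points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        converged = new_labels == labels
        labels = new_labels
        if converged:
            break
    return centroids, labels

def wcss(points, centroids, labels):
    """Within-cluster sum of squares for a given clustering."""
    return sum(squared_distance(p, centroids[l])
               for p, l in zip(points, labels))

def silhouette(points, labels):
    """Mean silhouette score; singleton or single-cluster cases score 0."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}  # cluster label -> distances from p to its members
        for j, q in enumerate(points):
            if j != i:
                dist = squared_distance(p, q) ** 0.5
                by_cluster.setdefault(labels[j], []).append(dist)
        own = by_cluster.pop(labels[i], [])
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Made-up toy data: two visibly separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]

centroids, labels = kmeans(points, k=2)

# Compare the metrics across candidate numbers of clusters.
for k in range(1, 4):
    c, l = kmeans(points, k)
    print(k, round(wcss(points, c, l), 3), round(silhouette(points, l), 3))
```

In a sketch like this, the WCSS typically keeps shrinking as k grows, so one looks for the "elbow" where the improvement flattens, while the silhouette score is usually read by picking the k with the highest value. On real data the features would normally be put on a comparable scale first, since k-means relies on Euclidean distance.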